Cross-talk and interference can enhance information capacity of a signaling pathway 
A recurring motif in gene regulatory networks is transcription factors (TFs) that regulate each 
other, and then bind to overlapping sites on DNA, where they interact and synergistically control 
transcription of a target gene. Here, we suggest that this motif maximizes information flow in a noisy 
network. Gene expression is an inherently noisy process due to thermal fluctuations and the small 
number of molecules involved. A consequence of multiple TFs interacting at overlapping binding 
sites is that their binding noise becomes correlated. Using concepts from information theory we show 
that a signaling pathway transmits more information if 1) the noise of one input is correlated with 
that of the other, and 2) the input signals are not chosen independently. In the case of TFs, the latter 
criterion hints at up-stream cross-regulation. We explicitly demonstrate these ideas for the toy model 
of two TFs competing for the same binding site. We suggest that this mechanism potentially explains 
the motif of a coherent feed-forward loop terminating in overlapping binding sites commonly found 
in developmental networks, and discuss three specific examples. The systematic method proposed 
herein can be used to shed light on TF cross-regulation networks either from direct measurements 
of binding noise, or bioinformatic analysis of overlapping binding sites. 



I. INTRODUCTION 

Accurate transmission of information is of paramount importance in biology. For example, in the process of embryonic development, crude morphogen gradients need to be translated into precise expression levels in every cell and sharp boundaries between adjacent ones [1, 2]. 
The embryo accomplishes this using a complex network 
of signaling molecules that not only regulate the expres- 
sion level of the desired output gene, but also each other. 
One simple strategy for increasing accuracy is the use 
of multiple input signals. Indeed, the expression level of a single gene is frequently controlled by multiple transcription factors (for example bicoid and hunchback, or dorsal and twist, in the Drosophila embryo [1, 3, 4]). These transcription factors, however, often have overlapping binding sites that result in interactions at binding and synergistic control of transcription [3, 4]. 

Here, we suggest that interaction at the level of bind- 
ing (interference) is related to the upstream network of 
transcription factors regulating each other (cross-talk). 
Our main assumption is that the regulatory network is 
designed to optimize information transfer from the in- 
put (TF concentrations) to the output (gene expression 
level). This is a reasonable assumption in the case of de- 
velopment, where accurate positional information needs 
to be extracted from noisy morphogen concentrations [5]. 

First, we define the concept of a cis-regulatory network 
as a noisy communication channel, where the input en- 
codes information by taking on a range of values, i.e. a 
morphogen gradient that carries positional information. 
Decoding this information is subject to biological noise; 
for example, at the molecular level, the stochastic bind- 
ing of morphogens to receptors makes an exact read-out 
of their concentration impossible. 

We show that, in general, two input signals with correlated read-out noise can transmit more information if 
they are not independent information carriers but cho- 
sen from an 'entangled' joint-distribution, i.e. the con- 
centration of one morphogen in a given cell is related to 
that of the other. Physically, this implies that the two 
inputs regulate each other upstream through cross-talk. 
We demonstrate this for the simplest possible case by 
writing a toy model of two TFs competing for the same 
binding site. The competition at the binding site results 
in correlated binding/unbinding fluctuations. Solving for 
the optimal joint-distribution of the input TF concentra- 
tions indicates that upstream, one is positively regulated 
by the other. Despite the increase in noise for each indi- 
vidual input from the competition, two interacting TFs 
can transmit more information than two non-interacting 
TFs because of 1) correlated noise in the input read-out, and 2) 
an entangled optimal input distribution. 

We suggest that this mechanism is consistent with the 
recurring strategy of the coherent feed-forward motif, 
where one TF positively regulates another, and both bind 
to partially overlapping sites that induce interactions. In 
particular, we discuss the joint regulation of gene Race 
in the Drosophila embryo by intracellular protein Smads 
and its target zen; regulation of even-skipped stripe 2 
by bicoid and hunchback; and that of snail by dorsal and 
twist. Generalization to other forms of cross-talk, such as 
cross-phosphorylation, and other forms of interference, 
such as use of scaffold proteins, is also discussed. 

Another purpose of this work is to communicate the 
broader importance of correlated noise in biological sys- 
tems. The conventional approach of treating intracellular 
noise as uncorrelated fluctuations might mask the func- 
tional importance of intricate network structures used in 
replication, gene regulation, transcription, and transla- 
tion. 

II. GENE REGULATION AS A COMMUNICATION CHANNEL 

Regulatory networks in a cell are information process- 
ing modules that take in an input, such as concentration 
of a nutrient, and generate an output in the form of a gene 
expression level. Information in the input is typically en- 
coded as the steady-state concentration of a transcription 
factor c, which binds to the promoter site of the desired 
response gene and enhances or inhibits its transcription. 
At a molecular level, the process of binding is inherently noisy, subject to thermal agitations and low-copy-number fluctuations [5-7]. The noise is captured through a probabilistic relationship between the binding site read-out n and the TF concentration c, p(n|c). The detailed form of p(n|c) depends on physical parameters such as binding and unbinding rates. We can think of this process as communication across a noisy channel, where the read-out n from the binding site does not correspond exactly to the input c. To alleviate the impact of the noise, various strategies can be adopted, such as limiting the input to sufficiently spaced discrete concentration levels $c_i$ that result in non-overlapping read-out distributions $n_i$. 

Shannon's channel coding theorem [20, 21] tells us the 
maximum rate at which information can be communi- 
cated across a noisy channel, or the channel capacity. 
Throughout this work, we will assume that gene regulatory networks are selected to optimize the rate of information transmission. This is a strong but reasonable assumption; for example, the cell will clearly benefit from more accurate knowledge of the amount of nutrient in its environment. However, the cost of an optimal network can exceed the benefit of more accurate information. In this work, we do not account for the cost of a network; the only metric for comparison is the channel capacity. 

With knowledge of the nature of the noise in a channel, p(n|c), it is possible to compute the probability distribution of the input signal $P^*_{TF}(c)$ that maximizes the rate of information transmission. Essentially, this distribution 
tells the sender how often a particular TF concentration 
should be used for optimal transmission of information 
encoded in concentration (take for example spacing of 
discretized inputs suggested above). However, it does 
not tell the sender anything about the encoding and de- 
coding schemes. This abstraction is useful, allowing us 
to compute the optimal input without having derived the 
optimal coding. However, the optimal coding might re- 
quire input blocks of infinite size and complex codebooks, 
with little biological relevance. 
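
As an illustration of what "computing the optimal input distribution" means in practice, the sketch below applies the Blahut-Arimoto iteration to a small discrete channel. This is a standard numerical technique swapped in for illustration; it is not the small-noise analytical method used later in this paper. The function name `blahut_arimoto` and the three-level toy channel are hypothetical and chosen only for this example.

```python
import math


def blahut_arimoto(channel, iterations=500):
    """Return (capacity in bits, optimal input distribution).

    channel[i][j] is p(n_j | c_i): the probability of read-out j given input i.
    """
    n_in, n_out = len(channel), len(channel[0])

    def divergences(p):
        # q(n_j) = sum_i p(c_i) p(n_j | c_i) is the read-out distribution.
        q = [sum(p[i] * channel[i][j] for i in range(n_in)) for j in range(n_out)]
        # d[i] = KL divergence (in nats) between p(n | c_i) and q(n).
        return [
            sum(w * math.log(w / q[j]) for j, w in enumerate(channel[i]) if w > 0.0)
            for i in range(n_in)
        ]

    p = [1.0 / n_in] * n_in  # start from a uniform input distribution
    for _ in range(iterations):
        d = divergences(p)
        weights = [p[i] * math.exp(d[i]) for i in range(n_in)]
        total = sum(weights)
        p = [w / total for w in weights]

    d = divergences(p)
    capacity_bits = sum(p[i] * d[i] for i in range(n_in)) / math.log(2.0)
    return capacity_bits, p


# Hypothetical channel: three input levels read out through a two-state site.
toy_channel = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
capacity, p_opt = blahut_arimoto(toy_channel)
print(f"capacity = {capacity:.3f} bits")
print("optimal input distribution:", [round(x, 3) for x in p_opt])
```

In this toy channel the middle input level produces an uninformative read-out, so the iteration drives its weight toward zero. This mirrors the remark above that the optimal distribution tells the sender how often each concentration should be used, without specifying any coding scheme.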

Nevertheless, there are experimental observations con- 
sistent with the idea of regulatory systems maximizing 
information transmission rates. Tkacik et al. [8] have shown that experimental measurements of Hunchback concentration in early Drosophila embryo cells [12] have a distribution that closely matches the optimal frequency for the measured levels of noise in the system, with the system achieving 90% of its maximum transmission rate. 
Even for systems that are not optimized, channel capacity is a useful metric for comparison. A particular architecture of transcription factors and binding sites can potentially achieve a higher transmission rate than others. 

We consider the case of two transcription factors encoding information in their concentrations $c_{1,2}$ for regulation of a desired gene. First, we show that correlations in the read-out noise of the two transcription factors improve channel capacity. This is not surprising. Next, we write down a simple physical model that generates correlated read-out noise: competition for the same binding site. Competing transcriptional modules have been extensively observed in both prokaryotes [9] and eukaryotes [10]. Competing for a binding site clearly increases the noise in the read-out of each TF individually. What is not clear a priori is whether the gain from correlations can compensate for the loss in accuracy of the individual read-outs. We show that, surprisingly, competing transcription factors can transmit information at a higher rate. 

Essentially, this follows from fully exploring the optimal joint distribution of the frequency of inputs $P^*_{TF}(c_1, c_2)$, which is no longer separable, $P^*_{TF}(c_1, c_2) \neq P^*_1(c_1) P^*_2(c_2)$, in the case of competing TFs. More importantly, $P^*_{TF}(c_1, c_2)$ reveals correlations between the input TFs, shedding light on their upstream interactions. With correlated noise at the read-out level, the channel is optimized when the two transcription factors regulate each other, and are no longer independent information carriers. Below, we find that one TF ought to positively regulate the other. Comparison to biological systems reveals a predominance of the 'feed-forward' motif for transcription factors that compete for the same binding site, which is not inconsistent with our results. 



III. NOISE CORRELATIONS ENHANCE CAPACITY 

First, we quantify how correlations in read-out noise enhance the rate of information transmission, following closely the approach of [11]. Consider two transcription factors with concentrations $c_1$ and $c_2$ that regulate the expression level of a gene (denoted as g) through receptor binding read-outs $n_{1,2}$. To carry information, $c_{1,2}$ must take on a range of values. These values can vary, for example, as a function of space, as in the case of morphogens along an embryo. The frequency of observing a particular pair of concentrations $c_1$ and $c_2$ is given by $P_{TF}(c_1, c_2)$. The information content is maximized when this distribution is uniform, i.e. when all concentrations are equally likely. Of course, our aim is not to maximize the information content in $c_{1,2}$, but rather the information conveyed to the expression level g. 

For now, we neglect the details of the read-out mechanism (c to n) and focus on regulation control from $c_{1,2}$ to g. The noise in the expression levels results in a distribution of g for fixed TF concentrations, $P(g|c_1, c_2)$. Equivalently, we can fix the expression level g and consider the corresponding distribution of TFs, $P(c_1, c_2|g)$, assuming that there is a unique set of inputs for every value of g. The two distributions are related by Bayes' rule. The amount of information communicated from $c_{1,2}$ to g is given by the mutual information between the distributions of $c_{1,2}$ and g [20], 

$$I(g; c_1, c_2) = -\int dc_1\, dc_2\, P_{TF}(c_1, c_2) \log P_{TF}(c_1, c_2) + \int dg\, P_{exp}(g) \int dc_1\, dc_2\, P(c_1, c_2|g) \log P(c_1, c_2|g), \tag{1}$$

where the distribution of expression level g is given by $P_{exp}(g) = \int dc_1\, dc_2\, P(g|c_1, c_2)\, P_{TF}(c_1, c_2)$. 
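
To make the structure of the mutual information concrete, here is a minimal discrete sketch. It assumes the pair $(c_1, c_2)$ has been binned into a single input index and g into an output index, and that logarithms are taken base 2. The function name `mutual_information_bits` and the example tables are illustrative only.

```python
import math


def mutual_information_bits(joint):
    """joint[i][j] = P(c_i, g_j) for binned inputs c_i and outputs g_j."""
    n_c, n_g = len(joint), len(joint[0])
    p_c = [sum(joint[i]) for i in range(n_c)]
    p_g = [sum(joint[i][j] for i in range(n_c)) for j in range(n_g)]

    # First term: entropy of the input distribution, -sum P_TF log P_TF.
    input_entropy = -sum(p * math.log2(p) for p in p_c if p > 0.0)

    # Second term: sum_g P_exp(g) sum_c P(c|g) log P(c|g), always <= 0.
    conditional_term = 0.0
    for j in range(n_g):
        if p_g[j] == 0.0:
            continue
        for i in range(n_c):
            p_c_given_g = joint[i][j] / p_g[j]
            if p_c_given_g > 0.0:
                conditional_term += p_g[j] * p_c_given_g * math.log2(p_c_given_g)

    return input_entropy + conditional_term


noiseless = [[0.5, 0.0], [0.0, 0.5]]        # g identifies c exactly
uninformative = [[0.25, 0.25], [0.25, 0.25]]  # g is independent of c
print(mutual_information_bits(noiseless), mutual_information_bits(uninformative))
```

The two terms in the function correspond to the two integrals of the mutual information: the first rewards a broad input distribution, and the second penalizes residual uncertainty about the input once the expression level is known.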

We assume that the noise in $c_{1,2}$ for a fixed expression level g is small and distributed as a Gaussian around the mean value $\bar{c}(g)$, 

$$P(c_1, c_2|g) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\left[ -\frac{1}{2} (c - \bar{c}(g))^T \Sigma^{-1} (c - \bar{c}(g)) \right], \tag{2}$$

where $c = (c_1, c_2)$, and $\Sigma$ is the covariance matrix over the conditional probability for fixed g, or the noise covariance matrix, 

$$\Sigma_{ij}(g) = \langle (c_i - \bar{c}_i(g)) (c_j - \bar{c}_j(g)) \rangle. \tag{3}$$



Essentially, the small-noise approximation says that it is meaningful to think of a mean input-output response, which is what is commonly measured in experiments. We expand around the mean response to the next order. The approximation, although strong, has been verified for a variety of regulatory systems (see for example Bicoid-Hunchback in [12], or for other examples [13-15]), and enables us to analytically calculate the optimal distribution. The mutual information under this approximation is given by, 

$$I(g; c_1, c_2) = -\int dc_1\, dc_2\, P_{TF}(c_1, c_2) \log P_{TF}(c_1, c_2) + \int dc_1\, dc_2\, P_{TF}(c_1, c_2) \log\left( \frac{1}{2\pi e} |\Sigma^{-1}|^{1/2} \right), \tag{4}$$

where $\Sigma^{-1}$ is evaluated at the mean value of expression level g corresponding to a given c. 

To find the channel capacity, Eq. (4) is optimized for the input distribution $P_{TF}(c_1, c_2)$. We constrain the input concentration to lie in the range $c_{1,2} \in [c_{min}, c_{max} = 1]$, where the maximum concentration is normalized to one. The minimum concentration is set by the molecular nature of the input: a minimum of one input molecule per cell is required. With the probability distribution's normalization constraint introduced using a Lagrange multiplier, the optimal distribution must satisfy, 

$$\frac{\delta}{\delta P_{TF}(c_1, c_2)} \left[ I(g; c_1, c_2) - \lambda \int dc_1\, dc_2\, P_{TF}(c_1, c_2) \right] = 0. \tag{5}$$



The optimal input distribution in the small-noise approximation (Eq. (4)) is given by, 

$$P^*_{TF}(c_1, c_2) = \frac{1}{Z} \frac{1}{2\pi e\, |\Sigma|^{1/2}}, \tag{6}$$



where Z is the normalization constant. 
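
For readers who want the intermediate step, the following is a short sketch of how the stationarity condition leads to the optimal distribution. It assumes the small-noise mutual information has the Gaussian-entropy form of Eq. (4) with natural logarithms. The shorthand $f(c_1, c_2)$ for the noise-dependent factor is introduced here for brevity and does not appear in the original text.

```latex
\begin{align}
I &= -\int dc_1\, dc_2\, P_{TF} \log P_{TF} + \int dc_1\, dc_2\, P_{TF} \log f,
  \qquad f(c_1, c_2) \equiv \frac{|\Sigma^{-1}|^{1/2}}{2\pi e}, \\
0 &= \frac{\delta}{\delta P_{TF}} \left[ I - \lambda \int dc_1\, dc_2\, P_{TF} \right]
   = -\log P_{TF} - 1 + \log f - \lambda, \\
P^*_{TF} &= e^{-(1 + \lambda)} f = \frac{f}{Z},
  \qquad Z = \int dc_1\, dc_2\, f(c_1, c_2), \\
I^* &= -\int dc_1\, dc_2\, P^*_{TF} \log \frac{P^*_{TF}}{f} = \log Z .
\end{align}
```

With base-2 logarithms the constant in the second line changes, but it is absorbed into the multiplier $\lambda$, so the optimal distribution is unchanged and the capacity is simply expressed in bits.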

The maximum mutual information, or channel capac- 
ity for transmitting information from TF concentrations 
to expression level equals, 



$$I^* = \log_2 Z = \log_2 \left[ \frac{1}{2\pi e} \int dc_1\, dc_2\, \frac{1}{|\Sigma|^{1/2}} \right]. \tag{7}$$



We can repeat the same computation for one TF while neglecting the other, effectively ignoring the covariance of the noise (off-diagonal components of $\Sigma$). With no covariance, the noise distribution is separable, $P(c_1, c_2|g) = P(c_1|g) P(c_2|g)$. The optimal input distribution for TF$_1$ will be $P^*_1(c_1) \sim 1/\sqrt{\Sigma_{11}}$, and its channel capacity $I^*_1 \sim \log \int dc_1 / \sqrt{\Sigma_{11}}$, with a similar expression for the other TF. 

For the simple case where $\Sigma$ is independent of c, the channel capacity of the two TFs can be decomposed into its individual and joint contributions, 

$$I^* = I^*_1 + I^*_2 - \frac{1}{2} \log(1 - \rho^2), \tag{8}$$

where $I^*_{1,2}$ is the channel capacity of the transcription factors individually, and $\rho = \Sigma_{12} / \sqrt{\Sigma_{11} \Sigma_{22}}$ is the noise correlation coefficient for TF concentrations. Accounting for noise correlation enhances the rate of information transmission. In fact, in the limit of perfect correlation, $\rho \to \pm 1$, the capacity is infinite. This is expected, since under the small-noise approximation and perfectly correlated noise, some combination of inputs is always noise free. Noise-free continuous variables can transmit infinite information. 
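
The decomposition can be checked numerically for a constant covariance matrix. The sketch below assumes a square input range of side `side` (so that the integrals reduce to an area or a length) and uses the single-TF normalization $1/\sqrt{2\pi e\, \Sigma_{ii}}$. All function names and the example numbers are illustrative.

```python
import math


def capacity_joint_bits(s11, s22, rho, side=1.0):
    # Joint capacity for constant covariance: log2 of area / (2*pi*e*sqrt(det)).
    det = s11 * s22 * (1.0 - rho ** 2)
    return math.log2(side * side / (2.0 * math.pi * math.e * math.sqrt(det)))


def capacity_single_bits(s, side=1.0):
    # Capacity of one TF alone with constant noise variance s.
    return math.log2(side / math.sqrt(2.0 * math.pi * math.e * s))


def correlation_gain_bits(rho):
    # Extra capacity from correlated noise, -(1/2) log2(1 - rho^2).
    return -0.5 * math.log2(1.0 - rho ** 2)


s11, s22, rho = 1e-4, 4e-4, 0.8
joint = capacity_joint_bits(s11, s22, rho)
separate = capacity_single_bits(s11) + capacity_single_bits(s22)
print(f"joint = {joint:.3f} bits, separate = {separate:.3f} bits")
print(f"difference = {joint - separate:.3f} bits, "
      f"predicted gain = {correlation_gain_bits(rho):.3f} bits")
```

The gain depends only on the magnitude of the correlation coefficient, so negatively and positively correlated noise help equally in this constant-covariance case.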

In general, the optimal input distribution with noise- 
correlation is not separable to individual components, 
namely, 



$$P^*_{TF}(c_1, c_2) \neq P^*_1(c_1) P^*_2(c_2), \tag{9}$$

where $P^*_{1,2}$ is the marginal distribution for $c_{1,2}$. In a sense, $P^*_{TF}(c_1, c_2)$ is an entangled distribution, where the concentration of one TF determines the probability of observing a certain concentration of the other. Biologically, this hints at upstream interactions between the transcription factors, the form of which should be predictable from the nature of the noise correlations. 

The above abstract results are not surprising. The more important question is whether noise can be correlated, i.e. $P(c_1, c_2|g) \neq P(c_1|g) P(c_2|g)$, for the physical process of binding and unbinding of multiple TFs to a promoter region. We will demonstrate this below using competing transcription factor modules. 

IV. COMPETING TRANSCRIPTION FACTORS 

Transcription factors regulate gene expression levels by 
binding to cis-regulatory regions on the DNA. The design 
of these regions is highly complex in both prokaryotes and 
eukaryotes, with overlapping TF binding sites occurring 
frequently [9, 10, 30]. 

We write a simplified model of the two extremes of overlapping binding sites (Fig. 1). Fig. 1A depicts two TFs interacting independently with their corresponding binding sites, whereas Fig. 1B depicts the two TFs competing for the same site. In a gross simplification, we assume that the dominant source of noise is fluctuations in binding/unbinding of TFs to the promoter. Details of RNAP assembly and transcription are coarse-grained to a simple TF binding picture. The discrete nature of the output and its stochastic degradation, along with fluctuations in input concentrations, are discarded. Nonetheless, we will show that this simple model captures the essential role of noise correlations in a regulatory network. 

Following the approach of [17], let $n_{1,2}$ be the fractional occupation of the binding site by TF$_{1,2}$; $n_1 + n_2 \le 1$ is the fractional occupation of the site by either TF. A binding event can occur only if the site is unoccupied, which is the case a fraction $1 - n_1 - n_2$ of the time, 

$$\begin{aligned}
\frac{dn_1(t)}{dt} &= k_1 c_1 (1 - n_1 - n_2) - l_1 n_1, \\
\frac{dn_2(t)}{dt} &= k_2 c_2 (1 - n_1 - n_2) - l_2 n_2.
\end{aligned} \tag{10}$$



The binding rate (on-rate) is proportional to the concentration of TF present, and the off-rates are given by constants $l_{1,2}$. At thermal equilibrium these two rates are related through the principle of detailed balance, $k_1 c_1 / l_1 = \exp(F_1 / k_B T)$, where $F_1$ is the free energy gain in binding for TF$_1$, with a similar expression for the other TF. Throughout this section we rescale time so that k = 1. As per the previous section, TF concentrations are measured in units of $c_{max} = 1$. 



Eq. (10) is a dynamical picture of the fractional occupation of the binding site by each TF. At steady state the mean fractional occupation is denoted by $\bar{n}_{1,2}$. We incorporate thermal fluctuations around this mean by introducing small fluctuations in the rate constants, $\delta k_{1,2}$ and $\delta l_{1,2}$, and linearizing in the fluctuations $\delta n_{1,2}$ around the mean. The TF concentrations do not fluctuate; they are the fixed inputs of the system. Fluctuations in $n_{1,2}$ effectively introduce noise in the estimates of the concentrations, p(n|c). 
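
The steady state of the competitive binding kinetics can be written in closed form by setting both time derivatives to zero. The sketch below computes the mean occupancies and checks them against the right-hand side of the rate equations. The function names and the example parameter values are illustrative assumptions.

```python
def binding_rates(n1, n2, c1, c2, l1, l2, k1=1.0, k2=1.0):
    """Right-hand side of the competitive binding kinetics for (n1, n2)."""
    free = 1.0 - n1 - n2  # fraction of time the site is unoccupied
    return k1 * c1 * free - l1 * n1, k2 * c2 * free - l2 * n2


def steady_state_occupancy(c1, c2, l1, l2, k1=1.0, k2=1.0):
    """Mean occupancies obtained by setting both rates to zero."""
    a1 = k1 * c1 / l1  # binding strength of TF 1 (on-rate over off-rate)
    a2 = k2 * c2 / l2  # binding strength of TF 2
    norm = 1.0 + a1 + a2
    return a1 / norm, a2 / norm


# Example in rescaled units (k = 1, concentrations in units of c_max).
n1_bar, n2_bar = steady_state_occupancy(c1=0.1, c2=0.5, l1=1e-4, l2=1e-4)
print(f"mean occupancies: n1 = {n1_bar:.4f}, n2 = {n2_bar:.4f}")
print("residual rates:", binding_rates(n1_bar, n2_bar, 0.1, 0.5, 1e-4, 1e-4))
```

Because both TFs share the same normalization, raising the concentration of one lowers the mean occupancy of the other, which is the origin of the interference between the two read-outs.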

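From detailed balance, we know that $\delta k$ and $\delta l$ fluctuations around steady state can be replaced by the fluctuations in free energy $\delta F$ alone. With this substitution and taking the Fourier transform, the linearized equations take the form, 

$$\begin{aligned}
\frac{\delta \tilde{F}_1}{k_B T} &= \frac{1}{k_1 c_1 (1 - \bar{n}_1 - \bar{n}_2)} \left[ (k_1 c_1 + l_1 - i\omega)\, \delta \tilde{n}_1 + k_1 c_1\, \delta \tilde{n}_2 \right], \\
\frac{\delta \tilde{F}_2}{k_B T} &= \frac{1}{k_2 c_2 (1 - \bar{n}_1 - \bar{n}_2)} \left[ k_2 c_2\, \delta \tilde{n}_1 + (k_2 c_2 + l_2 - i\omega)\, \delta \tilde{n}_2 \right],
\end{aligned} \tag{11}$$

where tilde denotes the Fourier transform, i.e. $\delta \tilde{n}(\omega) = \int_{-\infty}^{\infty} \delta n(t)\, e^{i\omega t}\, dt$. In vectorial form, the relation becomes $\delta \tilde{F} = A\, \delta \tilde{n}$. 

FIG. 1. Independent vs. competing transcription factors. A) Non-overlapping binding sites. The read-outs $n_{1,2}$ are not correlated, and neither is the noise in estimating $c_{1,2}$. The expression level g is dependent on both inputs. B) Overlapping binding sites. The read-outs are dependent, resulting in correlated noise in estimating $c_1$ and $c_2$. [Schematic. Panel A: $c_1 \to p(n_1|c_1) \to g$ and $c_2 \to p(n_2|c_2) \to g$. Panel B: $(c_1, c_2) \to p(n_1, n_2|c_1, c_2) \to g$.] 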

Eq. (11) relates incremental fluctuations in the read-out $\delta n$ to fluctuations in free energy $\delta F$. This is a linear response relation, with the free energy playing the role of the driving force (for details, see [17]). Using the fluctuation-dissipation theorem [18], we calculate the power spectrum of the noise in n, 

$$S_n(\omega) = \frac{2 k_B T}{\omega}\, \Im\left[ A^{-1}(\omega) \right], \tag{12}$$

with $\Im$ denoting the imaginary part of the inverse of the A matrix computed above. From $S_n$, we can compute the covariance matrix, 

$$\langle \delta n^T \delta n \rangle = \int_{-1/\tau}^{1/\tau} \frac{d\omega}{2\pi}\, S_n(\omega). \tag{13}$$



$\tau$ denotes the integration time of the site. The read-out noise decreases with an increase in integration time. In our calculations, we assume a very short integration time, namely $1/\tau \to \infty$. The off-diagonal element of $\langle \delta n^T \delta n \rangle$ is the covariance between the two read-outs. With proper normalization, we can compute the correlation coefficient (Fig. 2A). The correlation coefficient is negative, since a more than expected occupation of the site by one TF will clearly result in less than expected occupation by the other. 

FIG. 2. Correlation coefficients. (A) Correlation coefficient of read-outs $n_1$ and $n_2$, for $l = 10^{-4}$ and $c_{min} = 10^{-3}$, as a function of log input TF concentration. (B) Correlation coefficient of the TF concentration estimates for the same parameter values. 

FIG. 3. Optimal input distribution and channel capacity. (A) The optimal input distribution for $l = 10^{-4}$ and $c_{min} = 10^{-3}$ as a function of log TF concentration. (B) The channel capacity in bits for the interacting and non-interacting case of two TFs (dashed-brown and solid-red curves respectively) as a function of the logarithm of the rescaled off-rate $l$. The y-offset is arbitrary. Blue curve denotes their difference. At biologically relevant $l = 10^{-4}$, interacting TFs have higher channel capacity. 
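
As a consistency check on the fluctuation-dissipation step, it may help to work through the single-TF limit ($n_2 = 0$), where everything can be done by hand. The sketch assumes the standard form $S_n(\omega) = (2 k_B T / \omega)\, \Im[\delta\tilde{n} / \delta\tilde{F}]$ and the short-integration-time limit $1/\tau \to \infty$; it is an illustration rather than the two-TF calculation of the text.

```latex
\begin{align}
\frac{\delta \tilde{n}}{\delta \tilde{F}}
  &= \frac{1}{k_B T}\, \frac{k c\, (1 - \bar{n})}{k c + l - i\omega}, \\
S_n(\omega)
  &= \frac{2 k_B T}{\omega}\, \Im\!\left[ \frac{\delta \tilde{n}}{\delta \tilde{F}} \right]
   = \frac{2\, k c\, (1 - \bar{n})}{(k c + l)^2 + \omega^2}, \\
\langle \delta n^2 \rangle
  &= \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, S_n(\omega)
   = \frac{k c\, (1 - \bar{n})}{k c + l}
   = \bar{n}\, (1 - \bar{n}),
  \qquad \bar{n} = \frac{k c}{k c + l}.
\end{align}
```

The total variance is that of a two-state (occupied or empty) site, as expected, and the Lorentzian spectrum shows why integrating only over $|\omega| < 1/\tau$ reduces the noise for longer integration times.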

Finally, we need to relate the noise in the read-out $\delta n$ to the noise in the estimated TF concentrations. To do so, we account for the sensitivity of the read-out to the TF concentrations. For example, a very large $c_1$ results in $n_1 \approx 1$ with little noise. This read-out, however, is not very sensitive to changes in $c_1$, and not useful in detecting concentration changes. Define the matrix $\Omega_{ij} = \partial c_i / \partial \bar{n}_j$. The covariance matrix for the noise in TF concentrations is given by, 

$$\Sigma = \langle \delta c^T \delta c \rangle = \Omega\, \langle \delta n^T \delta n \rangle\, \Omega^T. \tag{14}$$

In equating the covariance matrix in TF concentration to $\Sigma$ (the covariance matrix for a fixed g) of the previous section, we have introduced the extra assumption that the dominant noise in the channel going from c to g is from the read-out noise and not the expression level. Noise in g is assumed negligible and need not be propagated backwards and included in $\Sigma$. Since noise in g is most 
commonly shot-noise, this assumption is reasonable when 
expression-levels are high. This also means that our re- 
sults will not depend on the functional form of g on c (for 
the case when they do for one input see [HJQ2]). Fig.2B 
shows the correlation coefficient for the noise in TF con- 
centrations. The correlation coefficient is now positive. 
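
The sign change of the correlation coefficient can be reproduced with a few lines of arithmetic. The sketch below makes a simplifying assumption that is not taken from the text: in the short-integration-time limit it uses the equal-time covariance of a single site that is either empty, occupied by TF 1, or occupied by TF 2, instead of the full power-spectrum calculation. It then propagates this covariance through the inverse of the sensitivity Jacobian $\partial \bar{n} / \partial c$. All function names are illustrative, and k = 1 as in the text.

```python
import math


def mean_occupancy(c1, c2, l1, l2):
    a1, a2 = c1 / l1, c2 / l2  # k = 1 in rescaled units
    norm = 1.0 + a1 + a2
    return a1 / norm, a2 / norm


def readout_covariance(n1, n2):
    # Equal-time covariance of the occupancy indicators of one shared site
    # (assumption: a three-state site, see the lead-in above).
    return [[n1 * (1.0 - n1), -n1 * n2], [-n1 * n2, n2 * (1.0 - n2)]]


def sensitivity_matrix(c1, c2, l1, l2):
    # Omega = (d n / d c)^(-1), the inverse Jacobian of the mean read-out.
    a1, a2 = c1 / l1, c2 / l2
    norm = 1.0 + a1 + a2
    j11 = (norm - a1) / (l1 * norm ** 2)
    j12 = -a1 / (l2 * norm ** 2)
    j21 = -a2 / (l1 * norm ** 2)
    j22 = (norm - a2) / (l2 * norm ** 2)
    det = j11 * j22 - j12 * j21
    return [[j22 / det, -j12 / det], [-j21 / det, j11 / det]]


def propagate(omega, cov):
    # Sigma = Omega * cov * Omega^T for 2x2 matrices.
    tmp = [[sum(omega[i][k] * cov[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    return [[sum(tmp[i][k] * omega[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def correlation(cov):
    return cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])


c1, c2, l = 0.1, 0.5, 1e-4
n1, n2 = mean_occupancy(c1, c2, l, l)
cov_n = readout_covariance(n1, n2)
cov_c = propagate(sensitivity_matrix(c1, c2, l, l), cov_n)
print(f"read-out correlation:      {correlation(cov_n):+.4f}")
print(f"concentration correlation: {correlation(cov_c):+.4f}")
```

Under these assumptions the read-out correlation is negative while the correlation of the concentration estimates is positive, in line with the qualitative statements above. The numerical values are specific to this simplified sketch and should not be read as a reproduction of Fig. 2.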

Assuming the read-out noise computed above is the dominant source of noise, we can compute the optimal joint distribution of input concentrations $P^*_{TF}(c_1, c_2)$ by plugging the covariance matrix into Eq. (6) (Fig. 3A). Moreover, Eq. (7) tells us the channel capacity, or the maximum information transmission rate. Fig. 3B plots the channel capacity of two interacting TFs and two independent ones as a function of the logarithm of the off-rate, $\log_{10}(l)$. The interacting TFs have a higher channel capacity in the biologically relevant regime where $l \sim 10^{-4}$ and $c_{min} \sim 10^{-3}$ (see below). 
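
The order of magnitude of the rescaled off-rate can be checked with a quick unit conversion, using the figures quoted later in the paper (roughly 1000 TF molecules in a volume of about 1 cubic micrometer, and an equilibrium constant of about $10^{10}$). Treating the equilibrium constant as being in units of inverse molar is an assumption made here, since the text does not state the units. The function name is illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def rescaled_off_rate(k_eq_per_molar, max_molecules, volume_um3):
    """Off-rate in units where k = 1 and concentrations are in units of c_max.

    In those units detailed balance gives l = 1 / (K_eq * c_max).
    """
    litres = volume_um3 * 1e-15  # 1 cubic micrometer = 1e-15 litres
    c_max_molar = max_molecules / (AVOGADRO * litres)
    return 1.0 / (k_eq_per_molar * c_max_molar)


l_est = rescaled_off_rate(k_eq_per_molar=1e10, max_molecules=1000, volume_um3=1.0)
c_min = 1.0 / 1000  # one molecule out of a maximum of 1000
print(f"rescaled off-rate l ~ {l_est:.1e}, c_min = {c_min:.0e}")
```

Under these assumptions the estimate lands within an order of magnitude of $10^{-4}$, which is the regime referred to above.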

The channel capacity is higher for interacting TFs, despite an increase in the noise of individual read-outs, because the optimal joint distribution of input concentrations (Fig. 3A) is 'entangled' and no longer separable, $P^*_{TF}(c_1, c_2) \neq P^*_1(c_1) P^*_2(c_2)$. With an entangled distribution the system can explore degrees of freedom not present with two independent input distributions. In Fig. 4, we plot the log likelihood of observing the joint concentration $(c_1, c_2)$ compared to observing $c_1$ and $c_2$ independently from their marginal distributions, 

$$R = \log_{10} \frac{P^*_{TF}(c_1, c_2)}{P^*_1(c_1)\, P^*_2(c_2)},$$

where $P^*_1(c_1) = \int dc_2\, P^*_{TF}(c_1, c_2)$ is the marginal distribution of TF$_1$, with a similar expression for TF$_2$. 
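
The log ratio R is straightforward to evaluate once the joint distribution is tabulated on a grid. The sketch below uses a hypothetical 2x2 table (low and high concentration of each TF) purely to show the bookkeeping; the numbers are not taken from the paper.

```python
import math


def log_ratio(joint):
    """R[i][j] = log10 of P(c1_i, c2_j) / (P1(c1_i) * P2(c2_j)).

    joint[i][j] must be strictly positive and sum to one.
    """
    rows, cols = len(joint), len(joint[0])
    p1 = [sum(joint[i]) for i in range(rows)]                             # marginal of TF 1
    p2 = [sum(joint[i][j] for i in range(rows)) for j in range(cols)]    # marginal of TF 2
    return [[math.log10(joint[i][j] / (p1[i] * p2[j])) for j in range(cols)]
            for i in range(rows)]


# Hypothetical joint distribution favouring (low, low) and (high, high).
toy_joint = [[0.45, 0.05],
             [0.05, 0.45]]
for row in log_ratio(toy_joint):
    print([round(r, 3) for r in row])
```

Positive entries mark concentration pairs that occur more often than independence would predict, and negative entries mark pairs that are suppressed. A pattern with positive values on the diagonal is what one would expect if one TF positively regulates the other.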

Fig. 4 implies that the two TFs are no longer passive 
and in fact interact with each other. It is ~10 times less 
likely to observe one TF at a high concentration and the 
other at a low concentration simultaneously, compared to 
what is expected if they were independent. Similarly, it 
is ~10 times more likely to observe high concentrations 
of one TF if the concentration of the other is also high. 
This suggests that one TF positively regulates the other 
(see motif in Fig. 4). We assume that the noise intro- 
duced in TF concentration from cross-regulation is small 
compared to the noise in the read-out during expression. 
This is reasonable if the physical mechanism of cross-talk 
is different, or the cell is allowed longer integration times 
for TF regulation than that of the target gene. 



V. BIOLOGICAL EXAMPLES 

Where does a biological system lie in the abstract parameter space sketched above? As noted, we have rescaled time so that k = 1, and measured concentration in units of $c_{max} = 1$. With the assumption of fast integration time, the only parameters left are the off-rate $l$ and $c_{min}$. In a real cell, we expect an absolute maximum of roughly 1000 TF molecules (or a dynamic range of 1-1000 TF molecules) in a volume of ~1 μm^3 [23]. Hence, the minimum allowed concentration is $c_{min} = 10^{-3}$. A typical equilibrium constant of TF binding to DNA is $K_{eq} \sim 10^{10}$ [22]. Putting all this together, we find $l \sim 10^{-4}$. It is possible then that a real biological regulatory system can transmit more information by incorporating overlapping binding sites and an upstream positive feedback between the TFs. 

FIG. 4. TF upstream cross-regulation. The log likelihood of observing TF concentration $(c_1, c_2)$ compared to what is expected from independent distributions. It is much more likely to observe both TFs at either low or high concentrations together. This suggests that one TF positively regulates the other, the feed-forward motif (right). 

Concentration-dependent transcription regulation is particularly important at the developmental stage. Concentrations of morphogens dictate cell fate, for example, resulting in patterning of the Drosophila embryo along the dorsoventral axis [25]. It is likely that the embryo has optimized information transmission to ensure accurate patterning and later development. Gene regulation using a combination of transcription factors is also a common theme in development (see for example [26]). 

Xu et al. [24] have observed the feed-forward motif in regulation of gene Race in the Drosophila embryo. They report that intracellular protein Smads sets the expression level of zerknüllt (zen), then Smads in combination with zen (two-fold input) directly activate Race. Analysis of the binding sites of Smads and zen reveals slight overlaps, and experiments indicate that one protein facilitates binding of the other to the enhancer. This interaction can result in a positive correlation coefficient in TF concentration estimates, similar to that derived in the toy model above. The feed-forward motif then is predicted by the framework outlined in the last section. The previously proposed suggestion [21] that the feed-forward motif increases sensitivity to the input signal does not explain why the target is regulated by both the initial input and the target transcription factor. Proposed dynamical features associated with the feed-forward loop [9, 27] do not explain the need for overlapping binding sites and TF interactions at binding. 

Another example of a feed-forward motif coupled to binding interactions is the joint regulation of even-skipped (eve) stripe 2 by bicoid (bcd) and hunchback (hb). Small et al. [3] report cooperative binding interactions between bcd and hb and a clustering of their binding sites in the promoter region. Upstream, bcd positively regulates transcription of hb. Similarly, Ip et al. [4] have observed joint activation of gene snail (sna) by twist (twi) and dorsal (dl), which also exhibit cooperative binding interactions; dl directly regulates transcription of twi upstream. 

More generally, other forms of cross-talk besides transcriptional regulation can be used. For instance, in regulation of anaerobic respiration in E. coli, regulators NarP and NarL are jointly regulated through phosphorylation by histidine kinase NarQ. NarL is also phosphorylated by kinase NarX. Downstream, NarP and NarL share the same DNA binding site [28]. It is not, however, clear if optimizing channel capacity is relevant for this system. It is also possible that interference is implemented using schemes other than DNA binding, for example, through cooperative interactions of signaling molecules with scaffold proteins [29]. 


VI. DISCUSSION 

Using a simple model, we have shown that TF interaction at overlapping binding sites plus upstream cross-regulation can enhance information transmission compared to non-interacting TFs. We motivated the frequent observation of the feed-forward motif ending in overlapping binding sites in developmental gene networks. Although the feed-forward motif has been proposed before for optimizing information transmission in regulatory networks [16], we emphasize that our approach is fundamentally different, since it stems from correlated binding noise, and physically requires the existence of TF interactions at the binding level. This is indeed what is experimentally observed in the three examples discussed above. We also hope to convey the importance of careful analysis of correlated noise in biological systems. The conventional approach of treating intracellular noise as uncorrelated fluctuations might not reveal intricate connections in the complex networks involved in regulation, transcription, and translation. 

A myriad of logical regulatory circuits have been proposed through use of overlapping binding sites and interacting TFs [23, 30]. It is worthwhile to see if the upstream TF regulatory network of these systems can be correctly predicted from the overlap/interactions using the methodology outlined above. This will shed light on whether optimization of channel capacity is biologically relevant for gene regulation. Such analysis requires knowledge of the binding noise, which can be obtained in two ways. One is a bioinformatics approach, where the binding sequence of each TF is examined for overlap. A physical model is then required to make predictions on the nature of the noise from the overlap (a toy example of which was presented here) and possibly protein-protein interactions. The other approach requires measurement of the noise directly using single-molecule techniques [31], and its use in Eq. (6) to predict upstream TF cross-regulation. 






ACKNOWLEDGMENTS 



The author thanks Boris Shraiman for helpful discus- 
sions and critical reading of the manuscript, and Bill 
Bialek for introduction to the subject. This research was 
supported in part by the National Science Foundation 
under Grant No. NSF PHY05-51164. 



[1] Wolpert L, et al. (2006) Principles of Development (Ox- 
ford Univ Press), 3rd Ed. 

[2] Wolpert L (1969) Positional information and the spatial 
pattern of cellular differentiation. J Theor Biol 25:1-47. 

[3] Small S, Blair A, Levine M (1992) Regulation of even- 
skipped stripe 2 in the Drosophila embryo. The EMBO 
Journal 11:4047-4057. 

[4] Ip YT, Park RE, Kosman D, Yazdanbakhsh K, Levine M 
(1992) dorsal-twist interactions establish snail expression 
in the presumptive mesoderm of the Drosophila embryo. 
Genes Dev 6:1518-1530. 

[5] Elowitz MB, Levine AJ, Siggia ED, Swain PS (2002) 
Stochastic gene expression in a single cell. Science 
297:1183-1186. 

[6] Swain PS, Elowitz MB, Siggia ED (2002) Intrinsic and 
extrinsic contributions to stochasticity in gene expres- 
sion. Proc Natl Acad Sci USA 99:12795-12800. 

[7] Paulsson J (2004) Summing up the noise in gene net- 
works. Nature 427:415-418. 

[8] Tkacik G, Callan CG, Bialek W (2008) Information flow 
and optimization in transcriptional regulation. Proc Natl 
Acad Sci USA 105:12265-12270. 

[9] Shen-Orr SS, Milo R, Mangan S, Alon U (2002) Net- 
work motifs in the transcriptional regulation network of 
Escherichia coli. Nat Genet 31:64-68. 

[10] Lee TI, et al. (2002) Transcriptional regulatory networks 
in Saccharomyces cerevisiae. Science 298:799-804. 

[11] Tkacik G, Callan CG, Bialek W (2008) Information 
capacity of genetic regulatory elements. Phys Rev E 
78:011910. 

[12] Gregor T, Tank DW, Wieschaus EF, Bialek W (2007) 
Probing the limits to positional information. Cell 
130:153-164. 

[13] Raser JM, O'shea EK (2004) Control of stochasticity in 
eukaryotic gene expression. Science 304:1811-1814. 

[14] Newman JR, et al. (2006) Single-cell proteomic analysis 
of S. cerevisiae reveals the architecture of biological noise. 
Nature 441:840-846. 

[15] Rosenfeld N, et al. (2005) Gene regulation at the single- 
cell level. Science 307:1962-1965. 

[16] Walczak AM, Tkacik G, Bialek W (2010) Optimizing in- 
formation flow in small genetic networks. II. Feed-forward 
interactions. Phys Rev E 81:041905. 

[17] Bialek W, Setayeshgar S (2005) Physical limits to bio- 
chemical signaling. Proc Natl Acad Sci USA 102:10040- 
10045. 

[18] Kubo R (1966) The fluctuation-dissipation theorem. Rep 
Prog Phys 29:255-284. 

[19] Berg HC, Purcell EM (1977) Physics of chemoreception. 
Biophys J 20:193-219. 

[20] Shannon CE (1949) Communication in the presence of 
noise. Proc IRE 37:10-21. 

[21] Cover TM, Thomas JA (1991) Elements of Information 
Theory (Wiley, New York). 

[22] Bintu L, et al. (2005) Transcriptional regulation by the 
numbers: Applications. Curr Opin Genet Dev 15:125-135. 

[23] Buchler NE, Gerland U, Hwa T (2003) On schemes 
of combinatorial transcription logic. Proc Natl Acad Sci 
USA 100:5136-5141. 

[24] Xu M, Kirov N, Rushlow C (2005) Peak levels of BMP 
in the Drosophila embryo control target genes by a feed- 
forward mechanism. Development 132:1637-1647. 

[25] Morisato D, Anderson KV (1995) Signaling pathways 
that establish the dorsal-ventral pattern of the Drosophila 
embryo. Annu Rev Genet 29:371-399. 

[26] Howard ML, Davidson EH (2004) cis-Regulatory control 
circuits in development. Dev Biol 271:109-118. 

[27] Mangan S, Alon U (2003) Structure and function of the 
feed-forward loop network motif. Proc Natl Acad Sci 
USA 100:11980-11985. 

[28] Stewart V (2003) Biochemical Society Special Lecture. 
Nitrate- and nitrite-responsive sensors NarX and NarQ 
of proteobacteria. Biochem Soc Trans 31:1-10. 

[29] Good MC, Zalatan JG, Lim WA (2011) Scaffold proteins: 
Hubs for controlling the flow of cellular information. 
Science 332:680-686. 

[30] Hermsen R, Tans S, ten Wolde PR (2006) Transcriptional 
regulation by competing transcription factor modules. 
PLoS Comput Biol 2:e164. 

[31] Wang Y, Guo L, Golding I, Cox EC, Ong NP (2009) 
Quantitative transcription factor binding kinetics at the 
single-molecule level. Biophys J 96:609-620. 



